SHAP score
Rigorous Feature Importance Scores based on Shapley Value and Banzhaf Index
Huang, Xuanxiang, Létoffé, Olivier, Marques-Silva, Joao
Feature attribution methods based on game theory are ubiquitous in the field of eXplainable Artificial Intelligence (XAI). Recent works proposed rigorous feature attribution using logic-based explanations, specifically targeting high-stakes uses of machine learning (ML) models. Typically, such works exploit weak abductive explanation (WAXp) as the characteristic function to assign importance to features. However, one possible downside is that the contribution of non-WAXp sets is neglected. In fact, non-WAXp sets can also convey important information, because of the relationship between formal explanations (XPs) and adversarial examples (AExs). Accordingly, this paper leverages the Shapley value and the Banzhaf index to devise two novel feature importance scores. We take into account non-WAXp sets when computing feature contributions, and the novel scores quantify how effective each feature is at excluding AExs. Furthermore, the paper identifies properties and studies the computational complexity of the proposed scores.
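Both scores aggregate a feature's marginal contributions across coalitions, differing only in how coalitions are weighted. A minimal, self-contained sketch of that computation by brute force, with a toy characteristic function v standing in for the paper's WAXp-based one (the function below is illustrative only, not the paper's construction):

```python
from itertools import combinations
from math import factorial

def shapley(v, n):
    """Exact Shapley values: size-dependent weights over all coalitions."""
    scores = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                scores[i] += w * (v(set(S) | {i}) - v(set(S)))
    return scores

def banzhaf(v, n):
    """Exact Banzhaf indices: uniform weight 1/2^(n-1) on all coalitions."""
    scores = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for S in (set(c) for k in range(n) for c in combinations(others, k)):
            scores[i] += (v(S | {i}) - v(S)) / 2 ** (n - 1)
    return scores

# Toy game: feature 0 alone settles the outcome; 1 and 2 only do so jointly.
v = lambda S: 1.0 if 0 in S or {1, 2} <= S else 0.0
print(shapley(v, 3))  # [0.666..., 0.166..., 0.166...]
print(banzhaf(v, 3))  # [0.75, 0.25, 0.25]
```

In the paper's setting, v(S) would instead encode whether fixing the features in S excludes all adversarial examples, which is what makes non-WAXp sets informative.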
Consistency of Feature Attribution in Deep Learning Architectures for Multi-Omics
Claborne, Daniel, Flores, Javier, Erwin, Samantha, Durell, Luke, Richardson, Rachel, Fore, Ruby, Bramer, Lisa
Machine and deep learning have grown in popularity and use in biological research over the last decade but still present challenges in interpretability of the fitted model. The development and use of metrics to determine features driving predictions and increase model interpretability continues to be an open area of research. We investigate the use of Shapley Additive Explanations (SHAP) on a multi-view deep learning model applied to multi-omics data for the purposes of identifying biomolecules of interest. Rankings of features via these attribution methods are compared across various architectures to evaluate consistency of the method. We perform multiple computational experiments to assess the robustness of SHAP and investigate modeling approaches and diagnostics to increase and measure the reliability of the identification of important features. Accuracy of a random-forest model fit on subsets of features selected as being most influential, as well as clustering quality using only these features, are used as measures of effectiveness of the attribution method. Our findings indicate that the rankings of features resulting from SHAP are sensitive to the choice of architecture as well as different random initializations of weights, suggesting caution when using attribution methods on multi-view deep learning models applied to multi-omics data. We present an alternative, simple method to assess the robustness of identification of important biomolecules.
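To make the consistency check concrete, here is a minimal sketch in the same spirit, using the shap package's KernelExplainer on two identically-sized scikit-learn MLPs that differ only in their random seed; the synthetic dataset and all sizes are placeholders, not the paper's multi-omics pipeline:

```python
import numpy as np
import shap                                  # pip install shap
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
background = shap.sample(X, 50)              # background set for KernelSHAP

rankings = []
for seed in (0, 1):                          # two random initializations
    model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                          random_state=seed).fit(X, y)
    explainer = shap.KernelExplainer(lambda d: model.predict_proba(d)[:, 1],
                                     background)
    sv = explainer.shap_values(X[:10])       # attributions for 10 samples
    # Rank features by mean absolute SHAP value across the explained samples.
    rankings.append(np.abs(sv).mean(axis=0))

rho, _ = spearmanr(rankings[0], rankings[1])
print(f"Spearman correlation between rankings: {rho:.3f}")
```

A Spearman correlation well below 1 between the two mean-|SHAP| vectors is the kind of seed sensitivity the paper reports.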
When is the Computation of a Feature Attribution Method Tractable?
Barceló, P., Cominetti, R., Morgado, M.
Feature attribution methods have become essential for explaining machine learning models. Many popular approaches, such as SHAP and Banzhaf values, are grounded in power indices from cooperative game theory, which measure the contribution of features to model predictions. This work studies the computational complexity of power indices beyond SHAP, addressing the conditions under which they can be computed efficiently. We identify a simple condition on power indices that ensures that their computation is polynomially equivalent to evaluating expected values, extending known results for SHAP. We also introduce Bernoulli power indices, showing that their computation can be simplified to a constant number of expected value evaluations. Furthermore, we explore interaction power indices that quantify the importance of feature subsets, proving that their computational complexity mirrors that of individual features.
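For intuition on the expected-value reduction, consider the Banzhaf index: it is exactly a difference of two expectations of v over coalitions drawn by including each other player independently with probability 1/2, so it can be estimated with plain Monte Carlo. A small illustrative sketch (the majority game and sample count are arbitrary choices):

```python
import random

def bernoulli_index(v, n, i, p=0.5, samples=20000, seed=0):
    """Estimate E[v(S + {i})] - E[v(S)], S ~ product-Bernoulli(p) over the rest."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        S = {j for j in range(n) if j != i and rng.random() < p}
        acc += v(S | {i}) - v(S)
    return acc / samples

# Toy weighted-voting game: a coalition wins once it holds 3 of the 5 seats.
v = lambda S: 1.0 if len(S) >= 3 else 0.0
print([round(bernoulli_index(v, 5, i), 3) for i in range(5)])
# All players are symmetric; estimates cluster around C(4,2)/2^4 = 0.375.
```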
Explainability is Not a Game
The societal and economic significance of machine learning (ML) cannot be overstated, with many remarkable advances made in recent years. However, the operation of complex ML models is most often inscrutable, with the consequence that decisions taken by ML models cannot be fathomed by human decision makers. It is therefore important to devise automated approaches to explain the predictions made by complex ML models. This is the main motivation for eXplainable AI (XAI). Explanations thus serve to build trust, but also to debug complex AI systems.
Logic-Based Explainability: Past, Present & Future
In recent years, the impact of machine learning (ML) and artificial intelligence (AI) in society has been remarkable, and this impact is expected to continue in the foreseeable future. However, the adoption of AI/ML is also a cause of grave concern. The operation of the most advanced AI/ML models is often beyond the grasp of human decision makers. As a result, decisions that impact humans may not be understood and may lack rigorous validation. Explainable AI (XAI) is concerned with providing human decision-makers with understandable explanations for the predictions made by ML models; as such, XAI is a cornerstone of trustworthy AI. Despite its strategic importance, most work on XAI lacks rigor, and so its use in high-risk or safety-critical domains fosters distrust instead of contributing to building much-needed trust. Logic-based XAI has recently emerged as a rigorous alternative to these non-rigorous methods. This paper provides a technical survey of logic-based XAI: its origins, current topics of research, and emerging future topics. The paper also highlights the many myths that pervade non-rigorous approaches to XAI.
On the Tractability of SHAP Explanations under Markovian Distributions
Marzouk, Reda, de La Higuera, Colin
Thanks to its solid theoretical foundation, the SHAP framework is arguably one of the most widely utilized frameworks for local explainability of ML models. Despite its popularity, its exact computation is known to be very challenging, having been proven NP-hard in various configurations. Recent works have unveiled positive complexity results regarding the computation of the SHAP score for specific model families, encompassing decision trees, random forests, and some classes of Boolean circuits. Yet, all these positive results hinge on the assumption of feature independence, which is often simplistic in real-world scenarios. In this article, we investigate the computational complexity of the SHAP score by relaxing this assumption and introducing a Markovian perspective. We show that, under the Markovian assumption, computing the SHAP score for weighted automata, disjoint DNFs, and decision trees can be performed in polynomial time, offering a first positive complexity result for SHAP score computation that transcends the limitations of the feature independence assumption.
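A toy rendering of the setting, by exhaustive enumeration rather than the paper's polynomial-time algorithms: the SHAP value function v(S) = E[f(X) | X_S = x_S] is computed under a first-order Markov chain over four binary features (the chain parameters, model f, and instance x are made up for illustration):

```python
from itertools import combinations, product
from math import factorial

n = 4
init = [0.5, 0.5]                       # Pr[X_0 = 0], Pr[X_0 = 1]
T = [[0.8, 0.2], [0.3, 0.7]]            # T[a][b] = Pr[X_{t+1} = b | X_t = a]
f = lambda z: float(z[0] ^ z[3])        # model to explain: XOR of the ends
x = (1, 0, 0, 1)                        # instance being explained

def chain_prob(z):
    """Probability of a full assignment under the Markov chain."""
    p = init[z[0]]
    for t in range(1, n):
        p *= T[z[t - 1]][z[t]]
    return p

def v(S):
    """Conditional expectation of f given the features in S fixed to x."""
    num = den = 0.0
    for z in product((0, 1), repeat=n):
        if all(z[j] == x[j] for j in S):
            num += chain_prob(z) * f(z)
            den += chain_prob(z)
    return num / den

def shap(i):
    others = [j for j in range(n) if j != i]
    return sum(factorial(k) * factorial(n - k - 1) / factorial(n) *
               (v(set(S) | {i}) - v(set(S)))
               for k in range(n) for S in combinations(others, k))

print([round(shap(i), 4) for i in range(n)])
```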
From SHAP Scores to Feature Importance Scores
Letoffe, Olivier, Huang, Xuanxiang, Asher, Nicholas, Marques-Silva, Joao
A central goal of eXplainable Artificial Intelligence (XAI) is to assign relative importance to the features of a Machine Learning (ML) model given some prediction. The importance of this task of explainability by feature attribution is illustrated by the ubiquitous recent use of tools such as SHAP and LIME. Unfortunately, the exact computation of feature attributions, using the game-theoretical foundation underlying SHAP and LIME, can yield manifestly unsatisfactory results that are tantamount to reporting misleading relative feature importance. Recent work targeted rigorous feature attribution by studying axiomatic aggregations of features based on logic-based definitions of explanations by feature selection. This paper shows that there is an essential relationship between feature attribution and a priori voting power, and that those recently proposed axiomatic aggregations represent a few instantiations of the range of power indices studied in the past. Furthermore, it remains unclear how some of the most widely used power indices might be exploited as feature importance scores (FISs) in XAI, and which of these indices would be best suited for XAI by feature attribution, namely in terms of not producing results that could be deemed unsatisfactory. This paper proposes novel desirable properties that FISs should exhibit, proposes novel FISs exhibiting those properties, and conducts a rigorous analysis of the best-known power indices in terms of the proposed properties.
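The common template behind the indices in question is a semivalue: a sum of a feature's marginal contributions weighted by a function of coalition size alone, with Shapley and Banzhaf as two particular weightings. A small sketch of that template (the paper's specific FISs and desirable properties are not reproduced here):

```python
from itertools import combinations
from math import factorial

def power_index(v, n, i, weight):
    """Generic semivalue: sum over S of weight(|S|, n) * [v(S + {i}) - v(S)]."""
    others = [j for j in range(n) if j != i]
    return sum(weight(k, n) * (v(set(S) | {i}) - v(set(S)))
               for k in range(n) for S in combinations(others, k))

shapley_w = lambda k, n: factorial(k) * factorial(n - k - 1) / factorial(n)
banzhaf_w = lambda k, n: 1.0 / 2 ** (n - 1)

v = lambda S: 1.0 if len(S) >= 2 else 0.0      # simple majority among 3
print([power_index(v, 3, i, shapley_w) for i in range(3)])
# [0.333..., 0.333..., 0.333...] — Shapley sums to v(N) - v({}) = 1
print([power_index(v, 3, i, banzhaf_w) for i in range(3)])
# [0.5, 0.5, 0.5] — Banzhaf need not be efficient
```

Other historical indices correspond to other choices of the weight function, which is exactly the design space the paper evaluates against its proposed properties.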
On Correcting SHAP Scores
Letoffe, Olivier, Huang, Xuanxiang, Marques-Silva, Joao
Recent work uncovered examples of classifiers for which SHAP scores yield misleading feature attributions. While such examples might be perceived as suggesting the inadequacy of Shapley values for explainability, this paper shows that the source of the identified shortcomings of SHAP scores resides elsewhere. Concretely, the paper makes the case that the failings of SHAP scores result from the characteristic functions used in earlier works. Furthermore, the paper identifies a number of properties that characteristic functions ought to respect, and proposes several novel characteristic functions, each exhibiting one or more of the desired properties. More importantly, some of the characteristic functions proposed in this paper are guaranteed not to exhibit any of the shortcomings uncovered by earlier work. The paper also investigates the impact of the new characteristic functions on the complexity of computing SHAP scores. Finally, the paper proposes modifications to the tool SHAP so that it uses one of the novel characteristic functions, thereby eliminating some of the limitations reported for SHAP scores.
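A minimal sketch of the plug-in point at issue (the corrected characteristic functions proposed in the paper are not reproduced): the same Shapley aggregation fed two different toy characteristic functions, one averaging over uniform completions and one substituting a fixed baseline, yields different attributions for the same model and instance.

```python
from itertools import combinations, product
from math import factorial

n = 3
f = lambda z: float(z[0] and (z[1] or z[2]))   # toy Boolean classifier
x = (1, 1, 0)                                   # instance being explained

def v_expectation(S):
    """v_e(S): mean of f over uniform completions of the unfixed features."""
    vals = [f(tuple(x[j] if j in S else z[j] for j in range(n)))
            for z in product((0, 1), repeat=n)]
    return sum(vals) / len(vals)

def v_baseline(S):
    """v_b(S): unfixed features replaced by a fixed all-zeros baseline."""
    return f(tuple(x[j] if j in S else 0 for j in range(n)))

def shapley(v, i):
    others = [j for j in range(n) if j != i]
    return sum(factorial(k) * factorial(n - k - 1) / factorial(n) *
               (v(set(S) | {i}) - v(set(S)))
               for k in range(n) for S in combinations(others, k))

print([round(shapley(v_expectation, i), 4) for i in range(3)])  # [0.4167, 0.2917, -0.0833]
print([round(shapley(v_baseline, i), 4) for i in range(3)])     # [0.5, 0.5, 0.0]
```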
The Distributional Uncertainty of the SHAP score in Explainable Machine Learning
Cifuentes, Santiago, Bertossi, Leopoldo, Pardal, Nina, Abriola, Sergio, Martinez, Maria Vanina, Romero, Miguel
Attribution scores reflect how important the feature values in an input entity are for the output of a machine learning model. One of the most popular attribution scores is the SHAP score, an instantiation of the general Shapley value used in coalition game theory. The definition of this score relies on a probability distribution on the entity population. Since the exact distribution is generally unknown, it needs to be assigned subjectively or estimated from data, which may lead to misleading feature scores. In this paper, we propose a principled framework for reasoning about SHAP scores under unknown entity population distributions. In our framework, we consider an uncertainty region that contains the potential distributions, and the SHAP score of a feature becomes a function defined over this region. We study the basic problems of finding maxima and minima of this function, which allows us to determine tight ranges for the SHAP scores of all features. In particular, we pinpoint the complexity of these problems, and of other related ones, showing them to be NP-complete. Finally, we present experiments on a real-world dataset, showing that our framework may contribute to more robust feature scoring.
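A toy version of that question (the paper's uncertainty regions and complexity results are not reproduced): treat a feature's SHAP score as a function of an unknown Bernoulli parameter p and report its range over an interval of candidate values.

```python
from itertools import combinations, product
from math import factorial

n = 3
f = lambda z: float(z[0] or (z[1] and z[2]))   # toy Boolean classifier
x = (1, 0, 1)                                   # instance being explained

def shap_under_p(i, p):
    """SHAP score of feature i when features are i.i.d. Bernoulli(p)."""
    def prob(z):
        out = 1.0
        for b in z:
            out *= p if b == 1 else 1 - p
        return out
    def v(S):
        num = den = 0.0
        for z in product((0, 1), repeat=n):
            if all(z[j] == x[j] for j in S):
                pz = prob(z)
                num += pz * f(z)
                den += pz
        return num / den
    others = [j for j in range(n) if j != i]
    return sum(factorial(k) * factorial(n - k - 1) / factorial(n) *
               (v(set(S) | {i}) - v(set(S)))
               for k in range(n) for S in combinations(others, k))

grid = [0.3 + 0.05 * t for t in range(9)]       # uncertainty region: p in [0.3, 0.7]
scores = [shap_under_p(0, p) for p in grid]
print(f"SHAP range for feature 0: [{min(scores):.4f}, {max(scores):.4f}]")
```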
Efficient Computation of Shap Explanation Scores for Neural Network Classifiers via Knowledge Compilation
Bertossi, Leopoldo, Leon, Jorge E.
The use of Shap scores has become widespread in Explainable AI. However, their computation is in general intractable, in particular when done with a black-box classifier such as a neural network. Recent research has unveiled classes of open-box Boolean circuit classifiers for which Shap scores can be computed efficiently. We show how to transform binary neural networks into those circuits for efficient Shap computation, using logic-based knowledge compilation techniques. The performance gain is huge, as our experiments show.
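A toy illustration of the starting point (the compilation step to circuits is not shown): a binarized neuron is already a Boolean function, and each value v(S) below is a scaled model count of a cofactor of that function, which is precisely the quantity that knowledge compilation to tractable circuit classes makes efficient at scale.

```python
from itertools import combinations, product
from math import factorial

n = 4
w, theta = [1, -1, 1, 1], 2                        # binarized weights, threshold
neuron = lambda z: float(sum(wi * zi for wi, zi in zip(w, z)) >= theta)
x = (1, 0, 1, 1)                                    # instance being explained

def v(S):
    """Mean output with features in S fixed to x: a scaled cofactor model count."""
    outs = [neuron(tuple(x[j] if j in S else z[j] for j in range(n)))
            for z in product((0, 1), repeat=n)]
    return sum(outs) / len(outs)

def shap(i):
    others = [j for j in range(n) if j != i]
    return sum(factorial(k) * factorial(n - k - 1) / factorial(n) *
               (v(set(S) | {i}) - v(set(S)))
               for k in range(n) for S in combinations(others, k))

print([round(shap(i), 4) for i in range(n)])
```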